Abstract



We study the question of how to shuffle n cards when faced with an opponent who knows the initial
position of all the cards and can track every card when permuted, except when one takes K < n cards at
a time and shuffles them in a private buffer "behind your back," which we call buffer shuffling. The problem
arises naturally in the context of parallel mixnet servers as well as other security applications. Our
analysis is based on related analyses of load-balancing processes. We include extensions to variations
that involve corrupted servers and adversarially injected messages, which correspond to an opponent
who can peek at some shuffles in the buffer and who can mark some number of the cards. In addition,
our analysis makes novel use of a sum-of-squares metric for anonymity, which leads to improved performance
bounds for parallel mixnets and can also be used to bound well-known existing anonymity
measures.

1 Introduction 

Suppose an honest player, Alice, is playing cards with a card shark, Bob, who has a photographic memory 
and perfect vision. Not trusting Bob to shuffle, Alice insists on shuffling the deck for each hand they 
play. Unfortunately, Bob will only agree to this condition if he gets to scan through the deck of n cards 
before she shuffles, so that he sees each card and its position in the deck, and if he also gets to watch her 
shuffle. It isn't hard to realize that, even though several well-known card shuffling algorithms, like random
riffle shuffling [1], top-to-random shuffling [5], and Fisher-Yates shuffling [11], are great at placing cards
in random order, they are terrible at obscuring that order from someone like Bob who has memorized the 
initial ordering of the cards and is watching Alice's every move. Thus, these algorithms on their own are 
of little use to Alice. What she needs is a way to shuffle that can place cards in random order in a way that 
hides that order from Bob. We refer to this as the anonymous shuffling problem. Our goal in this paper is to 
show that, as long as Alice has a private buffer where she can shuffle a subset of the cards, she can solve the 
anonymous shuffling problem. 

Our main motivation for studying the anonymous shuffling problem in this paper comes from the problem
of designing efficient parallel mixnets. A parallel mix network (or mixnet) is a distributed mechanism
for connecting a set of n inputs with a set of n outputs in a way that hides the way the inputs and outputs 
are connected. This connection hiding is achieved by routing the n inputs as messages through a set of M 
mix servers in a series of synchronized rounds. In each round, the n inputs are randomly assigned to servers 
so that each server is assigned K = n/M messages. Then, each server randomly permutes the messages it 
receives and performs an encryption operation so that it is computationally infeasible for an eavesdropper 
watching the inputs and outputs of any (honest) server to determine which inputs are matched to the outputs. 
The mixnet repeats this process for a specific number of rounds. The goal of the adversary in this scenario 
is to determine (that is, link) one or more of the input messages with their corresponding outputs, while the 
mixnet shuffles so as to reduce the linkability between inputs and outputs to an acceptably small level. (See
Figure 1a.)

Figure 1: (a) A parallel mixnet with n = 16 inputs and M = 4 mix servers. Shaded boxes illustrate mix
servers, whose internal permutations are hidden from the adversary. The adversary is allowed to see the
global permutation performed in each round. (b) A corrupted parallel mixnet, where s = 3 servers are not
colluding with the adversary, who has injected f = 3 fake messages into the network.

Each of the servers is assumed to run the mixnet protocol correctly, which is enforced using cryptographic
primitives and public sources of randomness (e.g., see [3, 7, 10, 14, 15]). In some cases, we also
allow for a corrupted parallel mixnet, where some number s ≥ 1 of the servers behave properly, but the
remaining M − s servers collude with the adversary so as to reveal how they are internally permuting the
messages they receive. In addition, the adversary may also be allowed to inject some number, f < n, of
fake messages that are marked in a way that allows the adversary to determine their placement at any point
in the process, including in the final output ordering. (See Figure 1b.) In this paper, we are interested in
studying a class of algorithms for anonymous shuffling, to show how the analysis of these algorithms can
lead to improved protocols for uncorrupted and corrupted parallel mixnets.

1.1 Previous Related Work 

Arkin et al. [2] describe an attack on online poker based on exploiting a poor shuffling algorithm, among 
other security weaknesses. 

Chaum [4] introduced the concept of mix networks for achieving anonymity in messaging, and this work
has led to a host of other papers on the topic (e.g., see [14, 15]).

Golle and Juels [7] study the parallel mixing problem, where mix servers process messages synchronously
in parallel rounds, and discuss the cryptographic primitives sufficient to support parallel mixnet functionality.
Their scheme has a total mixing time of 2n(M − s + 1)/M and a number of parallel mixing rounds that
is 2(M − s + 1), assuming that $M^2$ divides n. It achieves a degree of anonymity "close" to n − f, using a
specialized anonymity measure, $\mathrm{Anon}_t$, that they define (which we discuss in more detail in Section 2). Note
that if s is, say, M/2, then their protocol requires as many rounds as the number of servers, which diminishes
the potential benefits of a parallel mixnet. In particular, their approach uses sequences of round-robin
permutations (cyclic shifts) rather than the standard parallel mixing protocol described above and illustrated
in Figure 1. Even then, Borisov [3] shows that their scheme can leak linkages between inputs and outputs (as
can the standard parallel mixing protocol if the number of rounds is too small) if the vast majority of inputs
are fake messages introduced by the adversary. Thus, it is reasonable to place realistic limits on how large f
can be, such as f < n/2, and require that the number of parallel mixing rounds is high enough to guarantee
a high degree of anonymity. Klonowski and Kutylowski [10] also study the anonymity of parallel mixnets,
characterizing it in terms of variation distance against the entire distribution for all messages. They consider
honest servers and the case of a single corrupted server. They do not consider an adversary who can inject
fake messages, however, and they only treat the case when $M^2$ is much less than n. Nevertheless, they show
a lower bound that says that full-distribution anonymity requires more than a constant number of rounds in
the case of dishonest servers. Our approach avoids this lower bound, however, by focusing our measure of
anonymity on the obfuscation provided to individual messages, as in other papers on parallel mixing (e.g.,
see [7, 8]), rather than to the entire distribution of messages, which allows us to achieve anonymity in a
constant number of rounds in some cases even in the presence of dishonest servers. In addition, our focus on
individual message anonymity is also motivated by the stated goal of mixnets, which is to provide anonymity
to individuals sending messages through a parallel mixnet by combining their messages together, rather than
to provide anonymity for the entire set of users as a group.

Goodrich, Mitzenmacher, Ohrimenko, and Tamassia [8] study a simple variant of the anonymous shuffling
problem with no corrupted servers or fake cards, addressing a problem similar to parallel mixing in the
context of oblivious storage. They show that when the number of cards per server each round is $K = n^{1/c}$,
then c + 1 rounds are sufficient to hide any specific initial card, so that the adversary can guess its location
with probability only 1/n + o(1/n). The current work provides a much more general and detailed result,
using much more robust techniques.

Our techniques are based on work in dynamic load balancing by Ghosh and Muthukrishnan [6]. In their
setting, tasks are balanced in a dynamic network by repeatedly choosing random matchings and balancing
tasks across each edge. Here, we extend this work by choosing random subcollections of K cards and
balancing weights, corresponding to probabilities of a specific card being one of those K, among the K
cards via the shuffling.

1.2 Our Results 

We study the problem of analyzing parallel mixnets in terms of a buffer-based solution to the anonymous
shuffling problem, assuming, as with other works on parallel mixnets [3, 7, 8, 10], that cryptographic primitives
exist to enforce re-encryption for each mix server, along with public sources of randomness and
permutation verification so that servers must correctly follow the mixing protocol even if corrupted.

In the buffer shuffling algorithm [8, 10], Alice repeatedly performs a series of shuffling rounds, as in the
parallel mixnet paradigm. That is, each round begins with Alice performing a random shuffle that places
the cards in random order (albeit in a way that the adversary, Bob, can see). Then she splits the ordered
cards into M piles, with each pile getting K = n/M cards. Finally, she randomly shuffles each pile, using
a private buffer that Bob cannot see into. Once she has completed her private shuffles, she stacks up her
piles, which become the working deck for the next round. She repeats these rounds until she is satisfied that
the deck is sufficiently shuffled for the adversary. Note that during her shuffling, Bob can see cards go in
and out of her buffer, but he cannot normally see cards while they are in the buffer. As we describe in more
detail shortly, Alice's goal is to prevent Bob from being able to track a card; that is, Bob should only be able
to guess the location of a card with probability 1/n + o(1/n), where generally we take the o(1/n) term to
be $O(1/n^b)$ for some b ≥ 1.

To characterize the power of the adversary in the parallel mixnet framework, we consider buffer shuffling
in a context where, for M − s specific uses of Alice's buffer within each round, Bob is allowed to see how
the cards are shuffled inside it. Likewise, we assume he is allowed to mark f < n of the cards in a way that
lets him determine their position in the deck at any time. We provide a novel analysis of this framework,
and show how this analysis can be used to design improved methods for designing parallel mixnets. For
instance, we show that buffer shuffling achieves our goal with O(1) rounds even if the number of servers
is relatively large and that buffer shuffling can be performed in O(log n) rounds even for high degrees of
compromise. We summarize our results and how they compare with the previous related work in Table 1.

Table 1: Summary of our results. We compare with the solutions of Golle and Juels [7], Klonowski and
Kutylowski [10], and Goodrich et al. [8]. The server restriction column refers to the parameter c in the
inequality $K \ge n^{1/c}$. The bounds of "medium" and "high" for corruption tolerance are made more precise
in the statement of the theorems. Our theorems, as well as those of Golle and Juels [7] and Goodrich et
al. [8], address anonymity from the perspective of individual messages; Klonowski and Kutylowski [10]
address anonymity for the entire distribution of messages.

2 Anonymity Measures 

We can model the anonymous shuffling problem in terms of probability distributions. Without loss of
generality, we can assume that the initial ordering of cards is [n] = (1, 2, ..., n). After Alice performs
t rounds of shuffling, let $w_i(t, c)$ denote the probability from the point of view of the adversary that the
card in position i at time t is the card numbered c, and let W(i, t) denote the distribution defined by these
probabilities (we may drop the i and t if they are clear from the context). The ideal is for this probability to
be 1/n for all i and c, which corresponds to the uniform distribution, U.

A natural way to measure anonymity is to use a distance metric to determine how close the distribution
W is to U, for any particular card i or in terms of a maximum taken over all the cards. The goal is for this
metric to converge to 0 quickly as a function of t.

Maximum difference. The maximum-difference metric, which is also known as the $L_\infty$ metric, specialized
to measure the distance between W and U, is

\[ \alpha(t) = \max_{i,c} \left| w_i(t, c) - 1/n \right| . \]

As mentioned above, the goal is to minimize $\alpha(t)$, getting close to 0 as quickly as possible.

Note that, in the case of buffer shuffling, the formula for $\alpha(t)$ can be simplified. In particular, since
Alice starts each round with a random permutation, $w_i(t, c) = w_i(t, 1)$. Thus, in our case, we can drop the
c and focus on $w_i(t)$, the probability that the i-th card is 1. In this case, we can simplify the definition as

\[ \alpha(t) = \max_{i} \left| w_i(t) - 1/n \right| . \]

The Anon measure of Golle and Juels. In the context of parallel mixing, Golle and Juels [7] define a
measure for anonymity, which, using the above notation, would be defined as follows:

\[ \mathrm{Anon}_t = \min_i \left( \max_c w_i(t, c) \right)^{-1} , \]

which they try to maximize. Note that $\max_c w_i(t, c) \ge 1/n$ for all i, so, to be consistent with the goals for
other anonymity measures, which are all based on minimizations, we can use the following $\mathrm{Anon}'_t$ definition
for an anonymity measure equivalent to that of Golle and Juels:

\[ \mathrm{Anon}'_t = \max_i \left( \max_c w_i(t, c) \right) = \max_{i,c} w_i(t, c) = (\mathrm{Anon}_t)^{-1} . \]

The $\mathrm{Anon}'_t$ measure is not an actual distance metric, with respect to U, however, since its smallest
value is 1/n, not 0. In addition, it is biased towards the knowledge gained by the adversary for positive
identifications and can downplay knowledge gained by ruling out possibilities. To see this, note that, if we
let $W^+$ denote all the $w_i(t, c)$ values that are at least 1/n and $W^-$ denote all the $w_i(t, c)$ values less than 1/n,
then

\[ \alpha(t) = \max\left\{ \max_{w_i(t,c) \in W^+} \{ w_i(t, c) - 1/n \} ,\ \max_{w_i(t,c) \in W^-} \{ 1/n - w_i(t, c) \} \right\}
 = \max\left\{ \mathrm{Anon}'_t - 1/n ,\ \max_{w_i(t,c) \in W^-} \{ 1/n - w_i(t, c) \} \right\} . \]

For example, suppose Alice shuffles the cards in a way that guarantees that the first card is not the Ace
of Spades but is otherwise as close to uniform as possible. Then, $\alpha(t) = 1/n$, since $w_1(t, 1) = 0$, and
$\mathrm{Anon}'_t = 1/(n-1)$ in this case. Alternatively, suppose Alice's shuffle is uniform except that it results in
$w_1(t, 1) = 1/(n-1)$ and $w_1(t, c) = 1/(n-1) - 1/(n-1)^2$, for $c \ne 1$. In this case, $\alpha(t) = 1/(n(n-1))$
while $\mathrm{Anon}'_t = 1/(n-1)$, as in the other example. The second example is much closer to uniform than the
first and doesn't allow Bob to rule out any specific card as being the first card, but the $\mathrm{Anon}'_t$ measure (and,
hence, the $\mathrm{Anon}_t$ measure) is the same in both cases. Therefore, we prefer to use anonymity measures that
are based on metrics and are unbiased measures of the distance from W to the uniform distribution, U.
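
The two examples can be checked numerically. The following Python sketch is only an illustration: it looks at the distribution for the first position alone (so no maximum over positions is taken), and the function names and the choice n = 52 are assumptions made for the example rather than part of the analysis.

```python
from fractions import Fraction


def alpha(row):
    """Maximum difference between one position's distribution and uniform."""
    n = len(row)
    return max(abs(p - Fraction(1, n)) for p in row)


def anon_prime(row):
    """Largest single probability in one position's distribution."""
    return max(row)


def example_rule_out(n):
    """First card is never card 1; all other cards are equally likely."""
    return [Fraction(0)] + [Fraction(1, n - 1)] * (n - 1)


def example_slight_bias(n):
    """Card 1 is slightly more likely; no card can be ruled out."""
    hi = Fraction(1, n - 1)
    lo = hi - Fraction(1, (n - 1) ** 2)
    return [hi] + [lo] * (n - 1)


n = 52  # illustrative deck size
for name, row in [("rule out", example_rule_out(n)),
                  ("slight bias", example_slight_bias(n))]:
    print(name, "alpha =", alpha(row), "Anon' =", anon_prime(row))
```

Exact rational arithmetic is used so that the equality of the two Anon' values, and the gap between the two maximum-difference values, is not blurred by rounding.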

Variation Distance. Li et al. [12] introduce a notion of anonymity called threshold closeness or t-closeness.
For categorical data, as in card shuffling and mixnets, this metric amounts to the variation distance between
the W-distribution defined by Alice's shuffling method and the (desired) uniform distribution, U, where
each card occurs with probability 1/n (see also [10]). In particular, this metric would be defined as follows
for buffer shuffling:

\[ \beta(t) = \frac{1}{2} \sum_{i=1}^{n} \left| w_i(t) - 1/n \right| , \]

which is the same as half the $L_1$ distance between the W-distribution and the uniform distribution, U. As
with other distance metrics, the goal is to minimize $\beta(t)$.

The sum-of-squares metric. In terms of measuring anonymity, an ideal metric is one that is sensitive
to outliers while still being easy to work with. The reason that outlier sensitivity is useful is that it focuses
our attention on the areas where the adversary, Bob, can gain the most advantage. The above anonymity
measures, related to the $L_1$ and $L_\infty$ metrics, have some sensitivity to outliers, but we would like to use a
metric that is more sensitive than these.

For this paper, we have chosen to focus on a metric for anonymity that is derived from a simple measure 
that is well-known for its sensitivity to outliers (which are undesirable in the context of anonymity). In this, 
the sum-of-squares metric, we take the sum of the squared differences between the given distribution and 
our desired ideal. In the context of buffer shuffling, this would be defined as follows:

\[ \Phi(t) = \sum_{i=1}^{n} \left( w_i(t) - 1/n \right)^2 , \]

which can be further simplified as follows:

\[ \Phi(t) = \sum_{i=1}^{n} \left( w_i^2(t) - 2 w_i(t)/n + 1/n^2 \right) = \left( \sum_{i=1}^{n} w_i^2(t) \right) - \frac{1}{n} . \]

This amounts to the square of the $L_2$-distance between the W-distribution and the uniform distribution, U.
The goal is to minimize $\Phi(t)$.

Incidentally, this simplified form of the sum-of-squares metric, for the buffer shuffle, doesn't take 
into account possible correlations between pairs of items, but we show in this paper how to extend the 
sum-of-squares metric to these contexts as well. 

Relationships between anonymity measures. Another benefit of the $\Phi(t)$ metric is that it can be used
to bound other metrics and measures for anonymity, by well-known relationships among the $L_p$ norms.
For instance, we can derive upper bounds for other metrics (which we leave as exercises for the interested
reader), such as

\[ \alpha(t) \le \Phi(t)^{1/2} \quad \text{and} \quad \beta(t) \le (n \Phi(t))^{1/2} / 2 . \]

And we can also derive lower bounds for other metrics, such as

\[ \Phi(t)^{1/2} / 2 \le \beta(t) \quad \text{and} \quad \Phi(t)^{1/2} / n^{1/2} \le \alpha(t) . \]

In addition, even though $\mathrm{Anon}'_t$ is not a metric, we can derive the following bound for it, since
$\mathrm{Anon}'_t \le \alpha(t) + 1/n$:

\[ \mathrm{Anon}'_t \le \Phi(t)^{1/2} + 1/n . \]

So, for the remainder of this paper, we focus primarily on the $\Phi(t)$ metric.
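
These norm relationships are easy to spot-check numerically. The sketch below is a sanity check under stated assumptions, not a proof: it treats a single probability vector w (one position's distribution), computes the maximum-difference, variation-distance, and sum-of-squares values, and tests the four bounds plus the bound on the largest single probability. The helper names and the floating-point tolerance are choices made for this example.

```python
import math
import random


def metrics(w):
    """Return (alpha, beta, phi) for a probability vector w versus uniform."""
    n = len(w)
    dev = [wi - 1.0 / n for wi in w]
    alpha = max(abs(d) for d in dev)        # maximum difference
    beta = 0.5 * sum(abs(d) for d in dev)   # variation distance
    phi = sum(d * d for d in dev)           # sum of squares
    return alpha, beta, phi


def bounds_hold(w, tol=1e-12):
    """Check the upper and lower bounds relating alpha, beta, and phi."""
    n = len(w)
    alpha, beta, phi = metrics(w)
    root = math.sqrt(phi)
    return (alpha <= root + tol
            and beta <= math.sqrt(n * phi) / 2 + tol
            and root / 2 <= beta + tol
            and root / math.sqrt(n) <= alpha + tol
            and max(w) <= root + 1.0 / n + tol)


rng = random.Random(0)
for _ in range(1000):
    raw = [rng.random() for _ in range(20)]
    total = sum(raw)
    assert bounds_hold([r / total for r in raw])
```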

3 Algorithms and Analysis 

Our parallel mixing algorithm repeats the following steps: 

1. Shuffle the cards, placing them according to a uniform permutation.

2. Under this ordering, divide the cards up into consecutive groups of K = n/M cards.¹

3. For each group of K cards, shuffle their cards randomly, hidden from the adversary.

We refer to each repetition of the above steps as a round. In the parallel mixnet setting, each group of K 
cards would be shuffled at a different server. 
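
One round can be sketched from the adversary's point of view. The Python below is a minimal illustration under the assumption that the adversary's belief is tracked as a weight per position and that a hidden uniform shuffle of a group replaces each weight in the group by the group average; the function names are hypothetical.

```python
import random


def buffer_shuffle_round(weights, K, rng):
    """Apply one round to the adversary's per-position weights.

    weights[i] is the adversary's probability that position i holds the
    tracked card. K must divide len(weights).
    """
    n = len(weights)
    if n % K != 0:
        raise ValueError("K must divide the number of cards")
    # Step 1: public uniform permutation (the adversary sees it, so the
    # weights simply move with the cards).
    order = list(range(n))
    rng.shuffle(order)
    permuted = [weights[j] for j in order]
    # Steps 2 and 3: consecutive groups of K cards, each shuffled privately,
    # which averages the adversary's weights within the group.
    out = []
    for start in range(0, n, K):
        group = permuted[start:start + K]
        avg = sum(group) / K
        out.extend([avg] * K)
    return out


def phi(weights):
    """Sum-of-squares distance from the uniform distribution."""
    n = len(weights)
    return sum(w * w for w in weights) - 1.0 / n


rng = random.Random(0)
w = [1.0] + [0.0] * 15          # n = 16 cards, tracked card starts on top
for t in range(4):
    print(t, phi(w))
    w = buffer_shuffle_round(w, 4, rng)
```

With n = 16 and K = 4, the first round always takes the potential from 15/16 to 3/16, because the tracked card lands in exactly one group of four; later rounds depend on the random permutation.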

Let $w_i(t)$ be the probability that the ith card after t rounds is the first card from time 0, from the point
of view of the adversary. (We drop the dependence on t where the meaning is clear.) Initially, $w_1 = 1$, and
$w_2, \ldots, w_n$ are all 0. Motivated by [6], let $\Phi(t)$ be a potential function $\Phi(t) = \left( \sum_i w_i(t)^2 \right) - \frac{1}{n}$, based on the
sum-of-squares metric, and let $\Delta\Phi(t) = \Phi(t) - \Phi(t+1)$. (Again, we drop the explicit dependence on t
where suitable.)

Our first goal is to prove the following theorem. 

¹ As we also study, we could alternatively assign each card uniformly at random to one of the M = n/K piles, with each group
getting K = n/M cards in expectation.

Theorem 1: A non-corrupted parallel mixnet, designed as described above, has $E[\Phi(t)] < K^{-t}$. In particular,
such a mixnet, with $K \ge n^{1/c}$, can mix messages in t = bc rounds so that the expected sum-of-squares
error, $E[\Phi(t)]$, between card-assignment probabilities and the uniform distribution is at most $1/n^b$, for any
fixed b ≥ 1.

Before proving this theorem, we note some implications. From Theorem 1 and Markov's inequality, using
t = 2bc rounds, we can bound the probability that $\Phi(t) > 1/n^b$ to be at most $1/n^b$, for any fixed b ≥ 1. So,
taking b = 2 implies $\alpha_t < 1/n$ with probability $1 - 1/n^2$, taking b = 3 implies $\beta_t < 1/n$ with probability
$1 - 1/n^3$, and taking b = 2 implies $\mathrm{Anon}'_t < (n-1)^{-1}$, with probability $1 - 1/n^2$, which achieves the
anonymity goal of Golle and Juels [7] (who only treat the case c = 2). Therefore, a constant number of
rounds suffices for anonymously shuffling the inputs in a parallel mixnet, provided servers can internally
mix $K \ge n^{1/c}$ items, for some constant c ≥ 1.
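
To make the round counts concrete, the following Python sketch computes the smallest t for which the bound $K^{-t}$ drops to $n^{-b}$ or below. It assumes integer n, K, and b, and the function name is hypothetical; it only evaluates the stated bound and says nothing about how tight that bound is.

```python
def rounds_for_target(n, K, b):
    """Smallest t with K**(-t) <= n**(-b), using exact integer arithmetic."""
    if n < 2 or K < 2 or b < 1:
        raise ValueError("expected n >= 2, K >= 2, b >= 1")
    target = n ** b
    t, power = 0, 1
    while power < target:   # stop once K**t >= n**b
        power *= K
        t += 1
    return t


# n = 10**6 messages, K = 1000 messages per server (so c = 2).
print(rounds_for_target(10**6, 1000, 2))       # rounds for the expectation bound with b = 2
print(rounds_for_target(10**6, 1000, 2 * 2))   # doubled exponent, as in the Markov argument
```

When K is exactly $n^{1/c}$, the first call returns bc and the second returns 2bc, matching the round counts quoted above.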

We now move to the proof. Let $\Delta\Phi^*$ represent how the potential changes when a group of K cards is
shuffled during a round. For clarity, we examine the cases of K = 2 and 3 before the general case.

• For 2 cards with incoming weights $w_i$ and $w_j$ (outgoing weights are the average):

\[ \Delta\Phi^* = w_i^2 + w_j^2 - 2\left( (w_i + w_j)/2 \right)^2 = (w_i - w_j)^2 / 2 . \]

• For 3 cards with incoming weights $w_i$, $w_j$, and $w_k$:

\[ \Delta\Phi^* = w_i^2 + w_j^2 + w_k^2 - 3\left( (w_i + w_j + w_k)/3 \right)^2
 = (w_i - w_j)^2/3 + (w_j - w_k)^2/3 + (w_k - w_i)^2/3 . \]

• For K cards with weights $w_{i_1}, w_{i_2}, \ldots, w_{i_K}$:

\[ \Delta\Phi^* = \frac{1}{K} \sum_{1 \le j < k \le K} \left( w_{i_j} - w_{i_k} \right)^2 . \]
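
The pairwise form of the potential drop can be verified with exact arithmetic. In this Python sketch (function names are illustrative), the drop is computed once directly, as the sum of squared incoming weights minus K times the squared average, and once through the pairwise formula.

```python
from fractions import Fraction
from itertools import combinations


def potential_drop_direct(ws):
    """Sum of squares before averaging minus sum of squares after."""
    K = len(ws)
    avg = sum(ws) / K
    return sum(w * w for w in ws) - K * avg * avg


def potential_drop_pairwise(ws):
    """(1/K) times the sum of squared differences over all pairs."""
    K = len(ws)
    return sum((a - b) ** 2 for a, b in combinations(ws, 2)) / K


ws = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6), Fraction(0)]
print(potential_drop_direct(ws), potential_drop_pairwise(ws))
```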

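We now proceed to bound $E[\Phi(t)]$ by making use of $\Delta\Phi$.

\[ E[\Delta\Phi] = \frac{1}{K} \sum_{1 \le i < j \le n} \Pr\left( (i, j) \text{ are in the same set of } K \text{ cards} \right) (w_i - w_j)^2
 = \frac{K-1}{K(n-1)} \sum_{1 \le i < j \le n} (w_i - w_j)^2
 = \frac{K-1}{2K(n-1)} \sum_{1 \le i, j \le n} (w_i - w_j)^2 . \]

Also,

\[ E[\Delta\Phi / \Phi] = \frac{K-1}{2K(n-1)} \cdot \frac{\sum_{i,j} \left( (w_i - 1/n) - (w_j - 1/n) \right)^2}{\sum_k (w_k - 1/n)^2} . \]

Let $x_i = w_i - 1/n$ to get

\[ E[\Delta\Phi / \Phi] = \frac{K-1}{2K(n-1)} \cdot \frac{\sum_{i,j} (x_i - x_j)^2}{\sum_k x_k^2} . \]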

Interestingly, when K = n, we should have mixing in one step, so in this case $E[\Delta\Phi/\Phi]$ should be 1. Notice that
if that is the case, then, perhaps surprisingly, the above expression is independent of the actual $x_i$ values, and
then we have immediately:

\[ E[\Delta\Phi / \Phi] = \frac{n(K-1)}{K(n-1)} . \]

We can in fact confirm this easily. Since $\sum_k x_k = 0$, we have

\[ \sum_{i,j} (x_i - x_j)^2 = \sum_{i,j} (x_i - x_j)^2 + 2 \left( \sum_k x_k \right)^2 = 2n \sum_k x_k^2 , \]

and cancellation gives the desired result.

This analysis also gives us fast convergence to the uniform distribution in the general case. Let
$\gamma = \frac{n(K-1)}{K(n-1)}$, and note $\gamma \le 1$. In particular,

\[ 1 - \gamma = \frac{n - K}{K(n-1)} < \frac{1}{K} . \]

Also note $\Phi(0) < 1$. So we have

\[ E[\Phi(t+1)] = (1 - \gamma) E[\Phi(t)] , \]

and a simple induction yields

\[ E[\Phi(t)] = (1 - \gamma)^t \Phi(0) < K^{-t} . \]

The rest of the theorem follows easily.
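
For small n the one-round contraction can be confirmed exactly by brute force. The Python sketch below is an illustrative check, not part of the proof: it enumerates every public permutation of a weight vector, applies group averaging to consecutive groups of K, and compares the average potential with $(1 - \gamma)$ times the starting potential, where $\gamma = n(K-1)/(K(n-1))$. All function names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import permutations


def phi(w):
    """Sum-of-squares distance from uniform, in exact arithmetic."""
    return sum(x * x for x in w) - Fraction(1, len(w))


def average_groups(w, K):
    """Replace each consecutive group of K weights by its average."""
    out = []
    for start in range(0, len(w), K):
        group = w[start:start + K]
        out.extend([sum(group) / K] * K)
    return out


def expected_phi_after_round(w, K):
    """Average of phi over all n! equally likely public permutations."""
    total, count = Fraction(0), 0
    for p in permutations(w):
        total += phi(average_groups(list(p), K))
        count += 1
    return total / count


def gamma(n, K):
    return Fraction(n * (K - 1), K * (n - 1))


w = [Fraction(3, 5), Fraction(1, 5), Fraction(1, 5),
     Fraction(0), Fraction(0), Fraction(0)]
lhs = expected_phi_after_round(w, 3)
rhs = (1 - gamma(6, 3)) * phi(w)
print(lhs, rhs)
```

The enumeration grows as n!, so this is only practical for a handful of cards; it is meant as a sanity check on the algebra.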

3.1 Extensions to Mixnets with Corrupted Servers 

In the case of there being corrupted servers, Bob will know the permutation for the cards assigned to each 
such server. In terms of the analysis, we can treat the permutation for each corrupted server as the identity 
operation, since Bob can simply undo that permutation. Let us suppose, then, that there are M = n/K 
servers, so that each obtains K cards in each round, and that 1 ≤ s ≤ n/K servers are uncorrupted.
Following our previous analysis, we find

\[ E[\Delta\Phi] = \frac{1}{K} \sum_{1 \le i < j \le n} \Pr\left( (i, j) \text{ are in the same uncorrupted server} \right) (w_i - w_j)^2
 = \frac{(K-1) s}{K (n-1) n/K} \sum_{1 \le i < j \le n} (w_i - w_j)^2
 = \frac{s(K-1)}{2n(n-1)} \sum_{1 \le i, j \le n} (w_i - w_j)^2 . \]

Again, based on our previous analysis, we have

\[ E[\Delta\Phi / \Phi] = \frac{s(K-1)}{n-1} . \]

Now let $\gamma' = \frac{s(K-1)}{n-1}$; if, for example, $s = \epsilon \frac{n-1}{K-1}$, then $1 - \gamma' = 1 - \epsilon$. In that case,

\[ E[\Phi(t)] = (1 - \epsilon)^t \Phi(0) \le (1 - \epsilon)^t . \]

Theorem 2: A corrupted parallel mixnet, designed as described above, with $s \ge \epsilon (n-1)/(K-1)$ non-corrupted
servers, for $\epsilon \ge 1/2$, can mix messages in t = b log n rounds so that the expected sum-of-squares
error, $E[\Phi(t)]$, between card-assignment probabilities and the uniform distribution is at most $1/n^b$, for any
fixed b ≥ 1. Likewise, if the number of corrupted servers is small enough that $\gamma' \ge 1 - n^{-1/c}$, with $K \ge n^{1/c}$ for some
constant c ≥ 1, then in t = bc rounds it is also the case that $E[\Phi(t)]$ is at most $1/n^b$, for any fixed b ≥ 1.

Thus, by Markov's inequality, using t = 2b log n or t = 2bc rounds, depending on the number of
uncorrupted servers, s, we can bound the probability that $\Phi(t) > 1/n^b$ to itself be at most $1/n^b$, for any
fixed b ≥ 1.

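As an instructive specific example, suppose $K = M = \sqrt{n}$, and there are a constant z servers that are
corrupted. Then

\[ 1 - \epsilon = 1 - \frac{(\sqrt{n} - z)(\sqrt{n} - 1)}{n - 1} = 1 - \frac{\sqrt{n} - z}{\sqrt{n} + 1} = \frac{z + 1}{\sqrt{n} + 1} . \]

Hence, in this specific case,

\[ E[\Phi(t)] \le \left( \frac{z+1}{\sqrt{n}+1} \right)^t < \frac{(z+1)^t}{n^{t/2}} , \]

and for any constant b, after 4b rounds we have that $\Phi(t) < n^{-b}$ with probability $1 - O(n^{-b})$.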

As our expressions become less clean in our remaining settings, we state a general theorem which can 
be applied to these settings in a straightforward way: 

Theorem 3: Given a parallel mixnet with corrupted servers or adversarially generated inputs, let $\gamma =
E[\Delta\Phi/\Phi]$ in that setting. Then in $t = b \log_{1/(1-\gamma)} n$ rounds the expected sum-of-squares error, $E[\Phi(t)]$,
between card-assignment probabilities and the uniform distribution is at most $1/n^b$, for any fixed b ≥ 1. In
particular, if $\gamma \ge 1/2$, at most b log n rounds are required; if $\gamma \ge 1 - n^{-1/c}$, at most bc rounds are required.
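
The round count in Theorem 3 can be evaluated for concrete parameters. The Python sketch below is a hedged illustration: `rounds_needed` finds the smallest t with $(1-\gamma)^t \le n^{-b}$ by repeated multiplication (avoiding logarithm rounding at exact powers), and `gamma_corrupted_servers` evaluates the contraction rate s(K − 1)/(n − 1) for s uncorrupted servers. Both names, and the sample parameters, are assumptions for the example.

```python
def rounds_needed(gamma, n, b):
    """Smallest t such that (1 - gamma)**t <= n**(-b)."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    target = float(n) ** (-b)
    t, value = 0, 1.0
    while value > target:
        value *= (1.0 - gamma)
        t += 1
    return t


def gamma_corrupted_servers(n, K, s):
    """Contraction rate with s uncorrupted servers of capacity K each."""
    return s * (K - 1) / (n - 1)


n, K = 10_000, 100            # M = n / K = 100 servers
for s in (100, 50, 10):       # number of uncorrupted servers
    g = gamma_corrupted_servers(n, K, s)
    print(s, round(g, 4), rounds_needed(g, n, 2))
```

When s equals the total number of servers n/K, the rate coincides with the uncorrupted value n(K − 1)/(K(n − 1)).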



3.2 Extensions to Mixnets with Corrupted Inputs 

For the case of corrupted inputs, Bob will be able to track those cards throughout the shuffle process. In the 
shuffling setting, we can think of some number of the cards as being marked — no matter what we do, Bob 
knows the locations of those cards. In terms of the analysis, we can treat this in the following way: when 
we have a group of K cards, it is as though we are shuffling only K' < K cards, where K' is the number 
of unmarked cards in the collection of K cards. Let us suppose that f ≤ n − 2 cards are marked. Note that
we may think of $w_i$ as being 0 for any cards in a marked position; alternatively, without loss of generality,
let us calculate at each step as though $w_i$ is non-zero only for i = 1 to n − f. (Think of $w_i$ as being the
appropriate value for the ith unmarked card.) Note that, for consistency, we must have

\[ \Phi(t) = \left( \sum_{i=1}^{n-f} w_i(t)^2 \right) - \frac{1}{n-f} . \]
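Following our previous analysis, we find

\[ E[\Delta\Phi] = \sum_{K'=2}^{K} \frac{1}{K'} \sum_{1 \le i < j \le n-f} \Pr\left( (i, j) \text{ are in the same set of } K' \text{ unmarked cards out of } K \right) (w_i - w_j)^2 \]
\[ = \sum_{K'=2}^{K} \frac{\binom{n-f}{K'} \binom{f}{K-K'}}{\binom{n}{K}} \cdot \frac{K'-1}{K'(n-1)} \sum_{1 \le i < j \le n-f} (w_i - w_j)^2 \]
\[ = \frac{1}{2(n-1)\binom{n}{f}} \sum_{K'=2}^{K} \binom{K}{K'} \binom{n-K}{n-f-K'} \frac{K'-1}{K'} \sum_{1 \le i, j \le n-f} (w_i - w_j)^2 . \]

We can then compute $E[\Delta\Phi/\Phi]$ as

\[ \frac{1}{2(n-1)\binom{n}{f}} \left( \sum_{K'=2}^{K} \binom{K}{K'} \binom{n-K}{n-f-K'} \frac{K'-1}{K'} \right) \frac{\sum_{1 \le i, j \le n-f} (x_i - x_j)^2}{\sum_{1 \le k \le n-f} x_k^2} . \]

Following the same computations as previously, we have

\[ E[\Delta\Phi/\Phi] = \frac{n-f}{(n-1)\binom{n}{f}} \sum_{K'=2}^{K} \binom{K}{K'} \binom{n-K}{n-f-K'} \frac{K'-1}{K'} . \]

Note the n − f term in the numerator in place of an n.

Now let $\nu$ equal the right hand side above; then we have

\[ E[\Phi(t)] = (1 - \nu)^t \Phi(0) \le (1 - \nu)^t . \]

In particular, it is clear that $\nu$ is smaller than the corresponding value $\gamma$ for the case without corrupted
inputs, so the convergence of $E[\Phi(t)]$ to 0 happens more slowly than in that case, as expected. Nevertheless,
we can still derive a theorem analogous to Theorem 2 using Theorem 3 and the above characterization of
$E[\Phi(t)]$. We omit a full restatement for space reasons.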

3.3 Extensions to Mixnets with Corrupted Servers and Inputs 

One nice aspect of our analysis is that combinations of corrupted servers and inputs are entirely straightforward.
In this setting, we have

\[ E[\Delta\Phi] = \sum_{K'=2}^{K} \frac{1}{K'} \sum_{1 \le i < j \le n-f} \Pr\left( (i, j) \text{ are in the same set of } K' \text{ unmarked cards out of } K \text{ at an uncorrupted server} \right) (w_i - w_j)^2 \]
\[ = \sum_{K'=2}^{K} \frac{sK}{n} \cdot \frac{\binom{n-f}{K'} \binom{f}{K-K'}}{\binom{n}{K}} \cdot \frac{K'-1}{K'(n-1)} \sum_{1 \le i < j \le n-f} (w_i - w_j)^2 \]
\[ = \frac{sK}{2n(n-1)\binom{n}{f}} \sum_{K'=2}^{K} \binom{K}{K'} \binom{n-K}{n-f-K'} \frac{K'-1}{K'} \sum_{1 \le i, j \le n-f} (w_i - w_j)^2 . \]

Hence

\[ E[\Delta\Phi/\Phi] = \frac{sK(n-f)}{n(n-1)\binom{n}{f}} \sum_{K'=2}^{K} \binom{K}{K'} \binom{n-K}{n-f-K'} \frac{K'-1}{K'} . \]

Given this bound, we can then derive a theorem analogous to Theorem 2 for the case when mix servers
can be corrupted and the adversary can inject fake messages using Theorem 3. We omit a full restatement
for space reasons.

3.4 Extensions to Mixnets with a Less Powerful Global Step

Under our model, we can also consider a weaker global step between rounds, in which the cards need not be split
evenly into n/K groups of size K for the n/K servers. Instead, let us assume simply that each server
obtains each card independently with probability K/n; this requires less synchronization of the messages
between rounds. For example, previously we have assumed that the encryption step taken by each server on
each round is used to provide the random permutation that maps messages to servers, and that a global step
sorted the re-encrypted messages in order for each server to obtain n/K messages. Assume instead that
the encrypted messages are mapped to values, which can be assumed to be uniformly distributed over their
range, and the range for such messages is divided into n/K equally sized subranges, one for each server.
Then the server for each message for each round is determined without a complete sort (indeed, at the end of
each round the behavior is like the first step of a radix sort on random inputs). Assuming K is sufficiently
large (e.g., at least c log n for a suitable constant c), Chernoff bounds yield that each server will obtain
at most $(1 + \epsilon)K$ cards in each round with high probability; hence, the maximum shuffle size can still be
bounded.
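
The concentration of server loads under independent assignment can be observed empirically. This Python sketch is only an illustration under the assumption that each card is sent to one of M = n/K servers chosen uniformly at random; the function name, seed, and parameters are hypothetical, and the simulation does not replace the Chernoff argument.

```python
import random
from collections import Counter


def random_server_loads(n, M, rng):
    """Assign each of n cards to one of M servers uniformly at random."""
    counts = Counter(rng.randrange(M) for _ in range(n))
    return [counts[server] for server in range(M)]


rng = random.Random(1)
n, M = 10_000, 10                 # expected load K = n / M = 1000
loads = random_server_loads(n, M, rng)
K = n // M
print("max load / K =", max(loads) / K)
```

For K well above log n the printed ratio stays close to 1, which is the behavior the bound on the maximum shuffle size relies on.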

Extending the analysis above, we have that $E[\Delta\Phi]$ equals




J J 



K'=2 l<i<j<n-f 



/ are in the\ 
same set of K' 
out of J un- 
marked cards at 
an uncorrupted 
server 



(Wi - Wj 




1 ! s (#)(',) K'-l 



E 



n/ J Q K'(n - 1) 



i<j 



K\ J sJ ' (j>)L-f-K>) K'- l 
n) „(„-!) 2^ («) K 




i<j 

n-K \K'-l\ 



n-f-K'l K' 



l<i ,j<n— f 



Generally, since for K sufficiently large we will have concentration of the number of cards around its
mean, the slowdown from the random distribution can be bounded by a small constant factor.



4 Conclusion and Open Problems 

In this paper, we have provided a comprehensive analysis of buffer shuffling and shown that this leads to 
improved algorithms for achieving anonymity and unlinkability in parallel mixnets. An interesting direction 
for future research could be to extend this analysis to other topologies, including hypercubes and expander 
graphs. 

A Additional Anonymity Measures



There are several additional measures that Alice can use to determine when she is close to placing her cards
in random order in a fashion that obscures that order from Bob. As with the $\mathrm{Anon}_t$ measure, the ones we
review here are also not metrics.



A.1 k-Anonymity

One measure of obfuscation is k-anonymity [16]. In the context of the anonymous shuffling problem, we
would say that Alice's shuffling algorithm achieves k-anonymity if, from Bob's perspective, every card in
Alice's final output has at least k input cards that have a possibility of being mapped to that card during the
shuffling. This measure has been used in several applications in computer security and privacy, including
mixnets [17, 18]. A well-known weakness of k-anonymity (e.g., see [9, 12, 13]), unfortunately, is that it
doesn't take probabilities into consideration. So, for example, if Bob has a 90% certainty about the identity
of the top card after Alice's shuffle, with all the other cards sharing the remaining 10%, then we would say
that the top card's identity had achieved n-anonymity (i.e., the maximum possible), even though Bob can be
confident about which card is on top. Thus, we feel that k-anonymity is insufficient to use as a measure of
obscurity for the anonymous shuffling problem.



A.2 Relative entropy 



Relative entropy measures, in bits, the amount of information that exists between two probability measures.
For probability distributions P and Q, it is defined as

\[ D(P \| Q) = \sum_i P_i \log_2 \frac{P_i}{Q_i} . \]

In the context of buffer shuffling, and the information present in the distribution, W, for a single card,
compared to the uniform distribution, U, this amounts to

\[ D(W \| U) = \sum_i w_i(t) \log_2 \frac{w_i(t)}{1/n} = \log_2 n + \sum_i w_i(t) \log_2 w_i(t) . \]

The goal for Alice would be to minimize the relative entropy, $D(W \| U)$. While relative entropy is more
sensitive to outliers than the distance notions, $\alpha_t$ and $\beta_t$, related to the $L_\infty$ and $L_1$ metrics, and it measures
information leakage in bits, which is a useful quantity, it is not always easy to work with. Moreover, it is not
actually a metric, since it doesn't satisfy the symmetric property.
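
As a small illustration, the relative entropy of a single-card distribution against the uniform distribution can be computed directly. In the Python sketch below, the function name is hypothetical, terms with zero probability are skipped (using the usual convention that 0 log 0 = 0), and the 52-card example with 90% certainty about one card is a made-up input for demonstration.

```python
import math


def relative_entropy_to_uniform(w):
    """D(W || U) in bits for a probability vector w over n outcomes."""
    n = len(w)
    return sum(p * math.log2(p * n) for p in w if p > 0)


n = 52
skewed = [0.9] + [0.1 / (n - 1)] * (n - 1)   # 90% certainty about one card
print(relative_entropy_to_uniform([1.0 / n] * n))
print(relative_entropy_to_uniform(skewed))
```

The uniform distribution leaks nothing, a point mass leaks the full $\log_2 n$ bits, and the skewed example falls in between.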